Title of the Invention



Prevention and Detection of IP Identification Wraparound Errors

Background of the Invention

1. Field of the Invention

This invention relates to reassembly of data fragments of fragmented datagrams in
a communication system. In particular, the invention relates to reducing and/or detecting a
likelihood of misassembly of data fragments in a communication system utilizing the Internet
Protocol (IP) caused by IP identification wraparound.

2. Description of the Related Art

The Internet Protocol (IP) has become one of the most widely used
communication protocols in the world. IP is part of a layered protocol, which means that another
higher level protocol typically uses IP for data communication. Examples of such higher level
protocols are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). In
addition, even higher level protocols are sometimes utilized, such as the Network File System
(NFS). These protocols are well known to those skilled in the art. The protocols are used to
send data from a sending station (e.g., a client or a server on a sending end of a communication)
to a receiving station (e.g., a client or a server on a receiving end of a communication), possibly
through one or more routing devices that form an IP path.

In order to send a TCP, UDP or other protocol datagram across an IP connection,
the datagram is encapsulated in an IP datagram. Often, the IP datagram must be fragmented into
plural IP data fragments in order to be sent using the physical network. For example, if a size of
the datagram exceeds the physical link's maximum transmission unit (MTU), that datagram must be
fragmented into plural IP data fragments with sizes that do not exceed the MTU. Then, a
receiving station reassembles the data fragments into the datagram.

A receiving station determines that data fragments belong to a single IP datagram
by looking at an IP identification number in a header of each data fragment. All data fragments
from the same IP datagram share the same IP identification number. In addition, the header of
each data fragment includes an offset from the start of the datagram, a length of the data
fragment, and a flag that indicates whether or not the datagram includes more data fragments.
This information is sufficient for reassembly of the IP datagram, which includes the original
TCP, UDP or other protocol datagram.
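
The header fields described above can be illustrated with a minimal sketch. The `Fragment` class and the `fragment` and `reassemble` functions are hypothetical names introduced only for illustration, and offsets are counted in plain bytes for simplicity (real IP headers express the offset in 8-byte units).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Fragment:
    ident: int            # IP identification number shared by all fragments of a datagram
    offset: int           # offset from the start of the datagram, in bytes (simplified)
    data: bytes           # fragment payload; the fragment length is len(data)
    more_fragments: bool  # flag: True if further fragments follow this one


def fragment(payload: bytes, ident: int, max_payload: int) -> List[Fragment]:
    """Split payload into fragments carrying at most max_payload bytes each."""
    frags = []
    for off in range(0, len(payload), max_payload):
        chunk = payload[off:off + max_payload]
        more = off + len(chunk) < len(payload)
        frags.append(Fragment(ident, off, chunk, more))
    return frags


def reassemble(frags: List[Fragment]) -> Optional[bytes]:
    """Return the datagram payload, or None while fragments are still missing.

    Assumes well-formed fragments that all lie within the datagram.
    """
    last = [f for f in frags if not f.more_fragments]
    if not last:
        return None  # total length is unknown until the last fragment arrives
    total = last[0].offset + len(last[0].data)
    buf = bytearray(total)
    covered = bytearray(total)  # 1 where a byte has been filled in
    for f in frags:
        end = f.offset + len(f.data)
        buf[f.offset:end] = f.data
        covered[f.offset:end] = b"\x01" * len(f.data)
    return bytes(buf) if all(covered) else None
```

Because each fragment carries its own offset, the fragments can arrive in any order and still be placed correctly, which is why the identification number is the only thing tying them to one datagram.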

According to IP, the IP identification number is 16 bits long with a range of 0 to
65535. A sending station conventionally uses a simple counter to determine the IP identification
number for each IP datagram. In the early days of IP communications, a receiving station most
likely would receive all data fragments of a datagram with a particular IP identification number
and reassemble the datagram well before this counter could wrap around. If a data fragment was
lost, thereby making reassembly of a datagram impossible, all received data fragments of that
datagram would be discarded after a timeout of 64 seconds. With the slower communication
speeds that existed in the early days of IP communications, this timeout was usually sufficient
to ensure data fragments would be discarded before the counter at the sending station could wrap
around.

However, today's Internet communications are much faster. Gigabit and 100Mb
Ethernet implementations are commonplace, and faster implementations are constantly being
developed. As the communications speed increases, the number of IP datagrams sent by a
sending station per unit of time also increases. Thus, the simple 16-bit counter conventionally
used to generate IP identification numbers wraps around much more quickly. In fact, in a high
speed setting, the counter can almost be guaranteed to wrap around within 64 seconds. Thus, a
receiving station can receive data fragments from two different IP datagrams that share a
common IP identification number before a first one of those datagrams is reassembled.
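
A rough calculation shows how quickly a 16-bit counter can wrap. The sketch below assumes a sender that saturates the link with datagrams of a hypothetical 8192 bytes each and ignores header overhead; the function name `wrap_seconds` is illustrative only.

```python
ID_SPACE = 1 << 16  # a 16-bit IP identification number has 65536 distinct values


def wrap_seconds(link_bits_per_second: float, datagram_bytes: int) -> float:
    """Time for a simple counter to wrap, assuming the sender saturates the link."""
    datagrams_per_second = link_bits_per_second / (datagram_bytes * 8)
    return ID_SPACE / datagrams_per_second


for label, rate in (("100 Mb/s", 100e6), ("1 Gb/s", 1e9)):
    print(label, round(wrap_seconds(rate, 8192), 1), "seconds")
```

Under these assumptions the counter wraps in roughly 43 seconds at 100 Mb/s and roughly 4.3 seconds at 1 Gb/s, both shorter than the 64 second reassembly timeout. Smaller datagrams shorten the wrap time further.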

Because of the nature of IP communications, it is possible for a data fragment
from a second one of two datagrams to arrive at the receiving station before a corresponding data
fragment from a first one of the two datagrams. Then, if the two datagrams share a common IP
identification number due to wraparound of the sending station's IP identification number
counter, the receiving station can misassemble the data fragments. This misassembly can result
in corruption of the datagram.

For example, if first datagram A is fragmented into data fragments A1, A2, A3,
A4 and A5, and second datagram B is fragmented into data fragments B1, B2, B3 and B4, it is
possible for a receiving station to receive the data fragment B2 before data fragment A2. Then, if
datagram A and datagram B share a common IP identification number due to wraparound of the
sending station's IP identification number counter, the receiving station can misassemble data
fragments A1, B2, A3, A4 and A5 into a datagram, which of course would not contain the proper
data.
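
The A/B example can be traced with a small sketch. It assumes a simplified receiver that keeps the first fragment to claim each position and that fragments of A and B occupy matching positions; `naive_reassemble` is a hypothetical helper, not a description of any particular IP stack.

```python
from typing import Dict, List, Optional, Tuple


def naive_reassemble(arrivals: List[Tuple[str, int]],
                     slots_needed: int) -> Optional[List[str]]:
    """Fill each position with the first fragment that claims it.

    Every arrival is (fragment name, position in the datagram). All arrivals are
    assumed to carry the same IP identification number, so the receiver cannot
    tell which datagram a fragment came from.
    """
    slots: Dict[int, str] = {}
    for name, position in arrivals:
        slots.setdefault(position, name)
        if len(slots) == slots_needed:
            return [slots[i] for i in range(slots_needed)]
    return None


# B2 overtakes A2 in the network and lands in A's second position.
arrivals = [("A1", 0), ("B2", 1), ("A3", 2), ("A4", 3), ("A5", 4), ("A2", 1)]
print(naive_reassemble(arrivals, slots_needed=5))
```

The datagram is declared complete as soon as every position is filled, so the genuine A2 arrives too late to matter.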

Higher level protocols such as TCP and UDP utilize checksums and length checks
in an attempt to catch such data corruption. However, the UDP checksum is only 16 bits long. It
has been found that in a high speed environment, IP misassembly errors might occur with
sufficient frequency that eventually a "false positive" checksum can result. In this case, the
checksum can indicate that the UDP datagram has been properly reassembled, while in fact the
datagram has been corrupted. Other properties of conventional IP exacerbate this situation, such
as IP's acceptance of overlapping data fragments during datagram reassembly. In a UDP
communication setting, these types of errors can lead to undetected data corruption. This data
corruption might only come to light when the data is actually utilized, a situation that preferably
should be avoided.
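
One property that limits the strength of a 16-bit one's-complement checksum is that it does not depend on the order of the 16-bit words it sums. The sketch below illustrates only that property; it omits the pseudo-header that the real UDP checksum also covers, and `internet_checksum` is an illustrative name.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum of the kind used by UDP (simplified)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


original = b"ABCDEFGH"
swapped = b"EFGHABCD"  # same 16-bit words in a different order
assert original != swapped
assert internet_checksum(original) == internet_checksum(swapped)
```

Two different payloads therefore can share a checksum, and with only 65536 possible checksum values a sufficiently large number of misassembled datagrams can be expected to produce an occasional match.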

Summary of the Invention

The invention addresses the foregoing concerns by implementing measures
designed to reduce a likelihood of misassembly of received data fragments from fragmented IP
datagrams. In addition, the invention implements measures designed to detect when a likelihood
of such misassembly is high so that appropriate corrective policies can be implemented.

One embodiment of an aspect of the invention is a method of generating IP
identification numbers for IP datagrams. In this embodiment, a plurality of IP identification
number generators is maintained. A plurality of receiving stations are associated with the
plurality of IP identification number generators such that each receiving station has an IP
identification number generator associated therewith. An IP identification number for a
datagram sent to one of the receiving stations is generated based on an output of the associated IP
identification number generator. This method preferably is performed by an IP layer of a sending
station's communication system.

By using plural number generators, this aspect of the invention slows down
wraparound of IP identification numbers used for communication with any given receiving
station.

Preferably, each of the IP identification number generators has at least one
receiving station associated therewith. At least one of the IP identification number generators
preferably has plural receiving stations associated therewith. In one embodiment, the plurality of
IP identification number generators forms an array of number generators such as 16-bit counters.
Preferably, the plurality of IP identification number generators is associated with the plurality of
receiving stations by hashing destination addresses for the receiving stations and, in one
embodiment, protocols for transmitting to those receiving stations so as to form an index to the
array. If the hashing includes protocol information, the hashing preferably is performed such that
at least half of the number generators in the array are associated with UDP protocol
communications.

An embodiment of another aspect of the invention is a method of reducing a
likelihood of misassembly of data fragments from fragmented IP datagrams. In this method, data
fragments of a datagram having an IP identification number are received. All received data
fragments of the datagram are discarded upon detection of receipt of an overlapping data
fragment having the IP identification number, wherein the overlapping data fragment overlaps
data in an already-received data fragment. The overlapping data fragment can overlap all or less
than all of the already-received data fragment(s). This method preferably is performed by an IP
layer of a receiving station's communication system.
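
The discard-on-overlap rule can be sketched as follows. The class `OverlapDiscardingReassembler` and its return values are hypothetical, and fragments are keyed by IP identification number alone to keep the example short.

```python
from typing import Dict, List, Tuple


class OverlapDiscardingReassembler:
    """Drops every stored fragment of a datagram when an overlapping one arrives."""

    def __init__(self) -> None:
        # IP identification number -> list of (offset, data, more_fragments)
        self.pending: Dict[int, List[Tuple[int, bytes, bool]]] = {}

    def receive(self, ident: int, offset: int, data: bytes,
                more_fragments: bool) -> str:
        frags = self.pending.setdefault(ident, [])
        end = offset + len(data)
        for other_offset, other_data, _ in frags:
            # Two byte ranges overlap if each one starts before the other ends.
            if offset < other_offset + len(other_data) and other_offset < end:
                del self.pending[ident]  # discard all fragments received so far
                return "discarded"
        frags.append((offset, data, more_fragments))
        return "stored"
```

The same test covers both a complete overlap and a partial one, since either case means two fragments claim at least one common byte of the datagram.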

An embodiment of another aspect of the invention also is a method of reducing a
likelihood of misassembly of data fragments from fragmented IP datagrams. According to this
method, a timeout for reassembling the datagrams is reduced to less than a standard timeout.
Preferably, the datagram reassembly timeout is reduced to 45 seconds from the standard timeout
of 64 seconds. Alternatively, the datagram reassembly timeout is dynamically reduced based on
NFS data for round-trip times between a sending station and a receiving station. This method
preferably is performed by an IP layer of a receiving station's communication system.

Yet another aspect of the invention is embodied in a method of reducing a
likelihood of misassembly of data fragments from fragmented IP datagrams. This method
includes the steps of receiving data fragments of a datagram having an IP identification number,
and reducing a remaining time for reassembling the datagram upon detection of a gap in the
received data fragments. Preferably, the remaining time for reassembling the datagram is
reduced to eight seconds. This method also preferably is performed by an IP layer of a receiving
station's communication system.
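
A minimal sketch of the gap rule follows. It assumes that a "gap" means a hole between byte ranges already received, which is one plausible reading; `has_gap` and `remaining_time` are hypothetical helper names.

```python
from typing import Iterable, Tuple


def has_gap(ranges: Iterable[Tuple[int, int]]) -> bool:
    """True if the received (start, end) byte ranges leave a hole between them."""
    covered_to = None
    for start, end in sorted(ranges):
        if covered_to is not None and start > covered_to:
            return True
        covered_to = end if covered_to is None else max(covered_to, end)
    return False


def remaining_time(ranges: Iterable[Tuple[int, int]], remaining: float,
                   reduced: float = 8.0) -> float:
    """Cap the time left for reassembly at `reduced` seconds once a gap is seen."""
    return min(remaining, reduced) if has_gap(ranges) else remaining
```

Taking the minimum ensures that a datagram already close to its deadline is never granted extra time by the reduction.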

An additional aspect of the invention is embodied in another method of reducing a
likelihood of misassembly of data fragments from fragmented IP datagrams. According to this
method, data fragments of a first datagram are received, with the data fragments each having a
protocol identification number, a source address, and a first IP identification number. A
remaining time for reassembling the datagram is reduced upon detection, before receipt of a last
data fragment of the first datagram, of a data fragment of a second datagram having the protocol
identification number and the source address but having a second IP identification number.
Preferably, the remaining time for reassembling the datagram is reduced to eight seconds. This
method also preferably is performed by an IP layer of a receiving station's communication
system.
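
This rule can be sketched with a table of reassembly deadlines. The sketch assumes the caller removes a datagram from the table once it has been fully reassembled, so every entry is still incomplete; `note_fragment` and the `Key` tuple are hypothetical.

```python
from typing import Dict, Tuple

Key = Tuple[str, int, int]  # (source address, protocol number, IP identification number)


def note_fragment(deadlines: Dict[Key, float], key: Key, now: float,
                  timeout: float = 64.0, reduced: float = 8.0) -> None:
    """Record a fragment arrival; shorten deadlines of older incomplete datagrams.

    Only the first fragment of a new datagram triggers the reduction, so later
    fragments of the older datagram do not shorten the newer one's deadline.
    """
    if key in deadlines:
        return
    source, protocol, ident = key
    for other in list(deadlines):
        if other[:2] == (source, protocol) and other[2] != ident:
            deadlines[other] = min(deadlines[other], now + reduced)
    deadlines[key] = now + timeout
```

The arrival of a newer datagram from the same source and protocol suggests that the missing fragments of the older one are lost rather than delayed, which is why waiting out the full timeout is unlikely to help.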

A further aspect of the invention is embodied in a method of detecting a
likelihood of misassembly of data fragments from fragmented IP datagrams. In this embodiment,
communication errors between a sending station and a receiving station are detected. The
likelihood of misassembly is determined to be high upon detection that the communication errors
occur at a high rate for a predefined period of time. The communication errors that are detected
can include communication errors detected by an IP layer of the receiving station's
communication system. Such IP communication errors include, but are not limited to, receipt of
overlapping data fragments and IP datagram reassembly timeout errors. The communication
errors that are detected also can include communication errors detected by a UDP layer of the
receiving station's communication system. Such UDP communication errors include, but are not
limited to, UDP length errors and UDP checksum errors. The communication errors that are
detected also can include communication errors detected by an NFS layer of the sending station's
communication system.
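
One way to express "a high rate for a predefined period of time" is to count errors per interval and require several consecutive intervals above a threshold. The class name and both default values below are assumptions chosen for illustration; the text does not specify them.

```python
from collections import deque


class MisassemblyRiskDetector:
    def __init__(self, errors_per_interval: int = 10, intervals: int = 3) -> None:
        self.threshold = errors_per_interval   # hypothetical "high rate"
        self.recent = deque(maxlen=intervals)  # hypothetical "predefined period"

    def end_interval(self, error_count: int) -> bool:
        """Record one interval's error count; return True if risk is judged high."""
        self.recent.append(error_count)
        return (len(self.recent) == self.recent.maxlen
                and all(count >= self.threshold for count in self.recent))
```

The error count passed in could combine any of the error types listed above, such as overlapping fragments, reassembly timeouts, and UDP length or checksum errors.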

Preferably, upon detection that the likelihood of misassembly is high, policies are
implemented to reduce the likelihood of misassembly of data fragments. Examples of
implementations of such policies include, but are not limited to, preferentially using TCP instead
of UDP, using additional checksums and presenting a warning message to a system
administrator.

Another aspect of the invention is embodied in a method for a sending station to
detect a likelihood of misassembly at a receiving station of data fragments from fragmented IP
datagrams. This method includes the steps of determining a rate at which an IP identification
number generator associated with the receiving station wraps around, and determining that the
likelihood of misassembly at the receiving station is high upon determination that the IP
identification number generator wraps around at faster than a predetermined rate. Preferably, the
predetermined rate is once every ninety seconds. Alternatively, NFS re-transmissions also are
considered when determining if a likelihood of datagram misassembly is high.
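
The wrap-rate check can be sketched as a counter that remembers when it last wrapped. `WrapMonitor` is a hypothetical class, and the caller supplies the current time in seconds so that the example stays deterministic.

```python
from typing import Optional


class WrapMonitor:
    """16-bit counter that flags wraparound faster than once per min_wrap_interval."""

    def __init__(self, min_wrap_interval: float = 90.0) -> None:
        self.min_wrap_interval = min_wrap_interval
        self.counter = 0
        self.last_wrap: Optional[float] = None
        self.high_risk = False

    def next_id(self, now: float) -> int:
        ident = self.counter
        self.counter = (self.counter + 1) & 0xFFFF
        if self.counter == 0:  # the counter has just wrapped around
            if self.last_wrap is not None:
                self.high_risk = (now - self.last_wrap) < self.min_wrap_interval
            self.last_wrap = now
        return ident
```

The first wrap only records a timestamp, because a rate cannot be measured until two wraps have been observed.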

Policies are preferably implemented to reduce the likelihood of misassembly of
data fragments upon determining that the likelihood of misassembly is high. Examples of such
policies include, but are not limited to, preferentially using TCP instead of UDP, using additional
checksums, and presenting a warning message to a system administrator. When the sending
station maintains plural IP identification number generators, such policies also can include
reducing a number of receiving stations associated with the IP identification number generator
that is wrapping around at faster than the predetermined rate.

Each of the foregoing methods can be used in conjunction with the others in
various combinations to reduce and/or to detect a likelihood of misassembly of IP datagrams.
The invention also includes apparatuses such as sending and receiving stations configured to
perform the foregoing methods, computer readable code by itself or embodied in a computer
program product to cause a computer to perform the foregoing methods, and a memory storing
information including instructions executable by a processor to perform the foregoing methods.

This brief summary has been provided so that the nature of the invention may be
understood quickly. A more complete understanding of the invention may be obtained by
reference to the following description of the preferred embodiments thereof in connection with
the attached drawings.



Brief Description of the Drawings 

Figure 1 is a representational view of communication between a sending station 
and a receiving station across a network such as the Internet. 

Figure 2 is a representational view of a sending station using plural identification 
number generators to generate IP identification numbers. 

Figure 3 is a representational view of a receiving station discarding a datagram 
upon detection of an overlapping data fragment. 

Figure 4 is a representational view of a receiving station discarding a datagram 
upon detection of a partially overlapping data fragment. 

Figure 5 is a representational view of a reduced timeout for reassembling
datagrams at a receiving station.

Figure 6 is a representational view of a receiving station reducing a remaining
time for reassembling a datagram upon detection of a gap in received data fragments of the
datagram.

Figure 7 is a representational view of a receiving station reducing a remaining
time for reassembling a datagram upon detection of a data fragment from another datagram
having the same source address and protocol as the datagram but a different IP identification
number.

Figure 8 is a flowchart for explaining determination that a likelihood of
misassembly of datagrams is high upon detection of a high rate of communication errors for a
period of time.

Figure 9 is a flowchart for explaining determination that a likelihood of
misassembly of datagrams is high upon determination that an IP identification number generator
wraps around at faster than a predetermined rate.

Description of the Preferred Embodiment

In the following description, a preferred embodiment of the invention is described
with regard to preferred process steps and data structures. However, those skilled in the art
would recognize, after perusal of this application, that embodiments of the invention may be
implemented using one or more general purpose processors or special purpose processors
adapted to particular process steps and data structures operating under program control, that such
process steps and data structures can be embodied as information stored in or transmitted to and
from memories (e.g., fixed memories such as DRAMs, SRAMs, hard disks, caches, etc., and
removable memories such as floppy disks, CD-ROMs, data tapes, etc.), with the information
including instructions executable by such processors (e.g., object code that is directly executable,
source code that is executable after compilation, code that is executable through interpretation,
etc.), and that implementation of the preferred process steps and data structures described herein
using such equipment and structures would not require undue experimentation or further
invention.

Fig. 1 is a representational view of communication between a sending station and
a receiving station across a network such as the Internet. In Fig. 1, sending station 1 sends
information across network 2 to receiving station 3.

Sending station 1 can be a client sending data to a server, a server sending data to
a client, or any other device or entity sending data across network 2. Likewise, receiving station
3 can be a server receiving data from a client, a client receiving data from a server, or any other
device or entity receiving data across network 2.



103.1043.01 

A single device, such as a client or a server, can be both a sending station and a
receiving station, possibly simultaneously. For example, in typical two-way data
communications between a client and a server, the client is a sending station for communications
sent to the server and a receiving station for communications received from the server. Likewise,
the server is a receiving station for communications received from the client and a sending
station for communications sent to the client.

Sending station 1 communicates through a layered communication protocol.
Preferably, the layered communication protocol includes application layer 5, higher level layer 6
such as a Network File System (NFS) layer, transport layer 7 such as a Transmission Control Protocol
(TCP) layer, User Datagram Protocol (UDP) or other protocol layer, and Internet Protocol (IP)
layer 8. Various other combinations of layers are possible. For example, some sending stations
do not have higher level layer 6. Also, particular types of layers are designed to work with other
types of layers. For example, NFS was originally designed to work with UDP, not TCP. Finally,
some applications directly utilize the lower level UDP or IP layers, thereby bypassing much of
the error checking (e.g., checksum computations) provided by the higher level and application
layers. As data passes through each of the layers from an application program, each layer
performs operations on the data such as encapsulation.

Application layer 5 provides an interface for application programs to send data.
Application layer 5 might compute and add a checksum to the data. Such a checksum is useful
for ensuring data integrity at a receiving station. However, the application layer does not have to
use any such checksum.

Higher level layer 6 such as an NFS layer typically keeps track of network data.
This layer also can add a checksum, although such is not mandatory.

Transport layer 7 packages data in datagrams. Each datagram typically includes a
header and data. The data may be of various lengths. The header typically includes source and
destination address information, datagram length, and a checksum. For example, UDP specifies that a UDP
datagram has a header with a 16 bit source port number, a 16 bit destination port number, a 16 bit
UDP length, and a 16 bit checksum. The checksum is for both the datagram's header and data,
as well as for a pseudo-header that includes additional information (IP source address, IP
destination address, protocol, and datagram length).
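
The header layout described above can be sketched with Python's `struct` module. The helper names `udp_header` and `pseudo_header` are illustrative, the addresses are assumed to be IPv4, and the checksum field is left at zero rather than computed.

```python
import socket
import struct


def udp_header(src_port: int, dst_port: int, payload_len: int,
               checksum: int = 0) -> bytes:
    """Pack the four 16-bit UDP header fields in network byte order."""
    length = 8 + payload_len  # UDP length covers the 8-byte header plus the data
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """Pack the pseudo-header that the UDP checksum covers in addition to the datagram."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, socket.IPPROTO_UDP, udp_length))
```

The pseudo-header is never transmitted; it only feeds the checksum calculation, so that a datagram delivered to the wrong address or protocol fails verification.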



IP layer 8 encapsulates UDP, TCP or other protocol datagrams into IP datagrams
in order to send those datagrams across network 2. Often, an IP datagram must be fragmented
into plural IP data fragments in order to be sent across network 2. For example, if a size of the
datagram exceeds a known maximum transmission unit (MTU) for network 2, that datagram must be
fragmented into plural IP data fragments with sizes that do not exceed the MTU.

IP layer 8 generates an IP identification number for each IP datagram. All data
fragments from the same IP datagram share the same IP identification number. In addition, the
header of each data fragment includes an offset from the start of the datagram, a length of the
data fragment, and a flag that indicates whether or not the datagram includes more data
fragments. This information is sufficient for a receiving station to reassemble the IP datagram,
which includes the original TCP, UDP or other protocol datagram. IP datagrams also include a
checksum, but only for the header information.

According to IP, the IP identification number is 16 bits long with a range of 0 to
65535. In a high-speed communications setting, a conventional sending station might send many
more than 65535 datagrams in a short period of time, causing this IP identification number to
wrap around quickly. Thus, a sending station might send data fragments from two different
datagrams with the same IP identification number to the same receiving station. This duplicate
IP identification number can cause the receiving station to misassemble some of these data
fragments into a single datagram.

In a setting where only a UDP data checksum is used to verify data integrity (e.g.,
an application checksum is not used or is bypassed and TCP is not used), some of these
misassembled datagrams can slip through the weak 16 bit UDP checksum. This problem is
exacerbated by the fact that the UDP checksum's strength is data-type dependent, resulting in
similar checksums for similar types of data. For example, a corrupt datagram resulting from
misassembly of mismatched text data fragments has better than a 1:65535 chance of resulting in
a checksum that matches the checksum for the original datagram.

In order to reduce a likelihood of wraparound of IP identification numbers for
datagrams sent to a particular receiving station, an IP layer according to the invention utilizes
plural IP identification number generators, as discussed below with respect to Fig. 2.

In order to help determine when a likelihood of misassembly of datagrams is high,
a sending station according to the invention can monitor for a high rate of communication errors
that might be the result of datagram misassembly, as discussed below with reference to Fig. 8.
The sending station also can monitor the IP identification number generator(s) for rapid
wraparound, as discussed below with reference to Fig. 9.

Returning to Fig. 1, network 2 preferably includes a plurality of routers 10.
Examples of network 2 include the Internet, an intranet, an Ethernet network, and any other
network or virtual network that utilizes IP communications. The particular configuration of
network 2 is representational only of the inclusion of many routers and many possible
communication paths through network 2. This configuration has no other significance, and any
other configuration that allows communications through network 2 can be utilized with the
invention. For example, network 2 could be replaced with a single router 10 between sending
station 1 and receiving station 3.

Each of routers 10 can have an MTU smaller than the sizes of data fragments sent
to that router. If a router receives a data fragment larger than the router's MTU, the router can
further fragment the data fragment. Each data fragment of a datagram can take a different path
through network 2. The routers along these different paths can have different MTUs. Thus, data
fragments of a single datagram received by a receiving station can have different sizes.

Receiving station 3 also communicates through a layered communication
protocol. Preferably, the layered communication protocol includes layers corresponding to layers
in sending stations that might send data to the receiving station. Thus, in Fig. 1, the layered
communication protocol of receiving station 3 includes IP layer 12, transport layer 13 such as a
TCP, UDP or other protocol layer, higher level layer 14 such as an NFS layer, and application
layer 15. Various other combinations of layers are possible, and some applications directly
utilize the lower level UDP or IP layers, thereby bypassing much of the error checking (e.g.,
checksum verifications) provided by the higher level and application layers. As data passes
through each of the layers to an application program, each layer performs operations on the data
such as decapsulation.

IP layer 12 reassembles data fragments into datagrams based on IP identification
numbers, length data and flags in the headers of those data fragments. Reassembly time for a
datagram is limited by a timeout. If datagram reassembly time from when a first data fragment of
a datagram is received exceeds the timeout, all data fragments associated with the datagram are
discarded.

IP layer 12 also verifies a header checksum for received unfragmented datagrams
and data fragments, but this checksum only verifies the integrity of the associated IP headers.
This checksum therefore does not generally help prevent or detect data fragment misassembly, at
least because such misassembly can occur with completely self-consistent IP headers.

In order to reduce a likelihood of misassembly of data fragments from different
datagrams that have the same IP identification number, an IP layer of a receiving station
according to the invention can take several actions. The IP layer can discard all data fragments of
a datagram if an overlapping data fragment is received, as discussed below with reference to
Figs. 3 and 4. The IP layer also can reduce a timeout for datagram reassembly. The overall
timeout can be reduced, as discussed below with reference to Fig. 5. In addition, the time for
reassembly can be dynamically reduced if a gap is detected in received data fragments of a
datagram, as discussed below with respect to Fig. 6, or if a data fragment from another datagram
with a different IP identification number is received from the same source with the same
protocol, as discussed below with respect to Fig. 7.

In order to help determine when a likelihood of misassembly of datagrams is high,
a receiving station according to the invention can monitor for a high rate of communication
errors that might be the result of datagram misassembly, as discussed below with reference to
Fig. 8.

Transport layer 13 strips the TCP, UDP or other protocol header off of a datagram,
as appropriate. Both TCP and UDP can verify a checksum for the resulting data. However, as 
noted above, the UDP checksum is relatively weak. It should be noted that the TCP checksum 
also is not perfect. Corrupt data sometimes passes the TCP checksum, albeit with significantly 
less frequency than with the UDP checksum. 

The length of the datagram preferably also is verified by the transport layer. 
However, many length errors are corrected in the IP layer's datagram reassembly. For example, a 
750 byte data fragment inserted into a space for a 500 byte data fragment during datagram 
reassembly typically will not result in a UDP or TCP length error because the IP layer truncates 
overlong data fragments. Thus, length error checking also may not help catch datagram 
misassembly. 

Higher level layer 14 preferably works in conjunction with higher level layer 6 in 
sending station 1 to keep track of and to manage network data. Higher level layer 14 also can 
provide data integrity verification through checksums, although use of such checksums is not 
mandatory.

Application layer 15 provides an interface for an application program to receive
data. This layer also optionally can provide a checksum and error checking.

As is evident from the discussion above, many layers of a layered protocol used
for network communications can provide checksums and other error detection measures.
However, one common method for network communications is to have application programs
directly communicate using UDP and IP. The only data checksum in this configuration is the
UDP checksum, which is weak enough that it might miss some misassembly of data fragments.
The invention provides techniques for decreasing the likelihood of such misassembly, as well as
for detecting when a likelihood of misassembly is high.

Fig. 2 is a representational view of a sending station using plural identification
number generators to generate IP identification numbers.

Briefly, IP identification numbers for IP datagrams are generated. To generate
these identification numbers, a plurality of IP identification number generators are maintained. A
plurality of receiving stations are associated with the plurality of IP identification number
generators such that each receiving station has an IP identification number generator associated
therewith. An IP identification number is generated for a datagram sent to one of the receiving
stations based on an output of the associated IP identification number generator. Preferably, the
IP identification numbers are generated in an IP layer of a sending station's communication
system.
In more detail, Fig. 2 shows sending station 1 with array 17 of N plural IP
identification number generators 18. N preferably is a power of two to simplify indexing and
hashing, which are discussed below. Examples of N are 16 and 256. Each of IP identification
number generators 18 preferably is a 16-bit counter, corresponding to the 16 bits needed for an IP
identification number.

In order to associate a receiving station with an IP identification number
generator, sending station 1 preferably uses the receiving station's address. Optionally, sending
station 1 also uses the protocol for a particular datagram to be sent to that receiving station.
Preferably, the datagram's transport protocol (i.e., TCP, UDP or other protocol) is used for this
protocol. As shown in Fig. 2, receiving station address and protocol 20 for a datagram are
hashed by hash 21 to form index 22 to array 17.

In the preferred embodiment, there are more than N possible combinations of
receiving station addresses and protocols. In fact, sending station 1 may send data to more than
N separate receiving stations. Therefore, more than one receiving station can be associated with
each of IP identification number generators 18.

Hash 21 preferably is designed so that IP identification number generators 18 are
distributed fairly evenly among the receiving stations. Preferably, if there are more than N 
receiving stations, each of plural IP identification number generators 18 has at least one receiving 
station associated therewith. 

Furthermore, because UDP tends to be more susceptible to datagram misassembly 
than other transport protocols, hash 21 preferably is designed so that half of IP identification 
number generators 18 are associated with UDP. The other half of IP identification number 
generators 18 preferably are associated with all other protocols. Thus, each receiving station 
preferably will have an IP identification number generator associated therewith for UDP 
datagrams and an IP identification number generator associated therewith for all other protocol 
datagrams. This feature of hash 21 can be implemented by including a "UDP/non-UDP" bit in 
hash 21. 

Whenever sending station 1 needs to send an IP datagram to a receiving station, 
sending station 1 preferably sends receiving station address and protocol 20 for that datagram 
through hash 21 to form index 22. Index 22 is then used to index to one of the plural IP 
identification number generators 18, which provides the identification number and then 
increments (or vice versa).

By virtue of the foregoing arrangement, a single IP identification number
generator is not shared among all receiving stations. Rather, each of plural IP identification
number generators is shared only among the associated receiving station/protocol combinations.
Wraparound of IP identification numbers for datagrams sent to a particular receiving station
using a particular protocol thereby tends to be greatly slowed, reducing a likelihood that data
fragments from two datagrams having the same IP identification number will be sent to the same
receiving station before reassembly timeout.
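
By way of illustration only, the arrangement of Fig. 2 might be sketched as follows. The
sketch is not the claimed implementation. The class name, the use of CRC-32 as hash 21, and
the placement of the "UDP/non-UDP" bit in the top bit of index 22 are assumptions made for
the example; only the array of N 16-bit counters indexed by a hash of receiving station
address and protocol comes from the description above.

```python
import ipaddress
import zlib

N = 16    # number of generators; a power of two, as the description prefers
UDP = 17  # IP protocol number for UDP


class IdGeneratorArray:
    """Array of N 16-bit counters indexed by a hash of (address, protocol)."""

    def __init__(self, n=N):
        if n < 2 or n & (n - 1):
            raise ValueError("n must be a power of two, at least 2")
        self.n = n
        self.counters = [0] * n

    def index(self, dst_addr, protocol):
        # Hypothetical hash: CRC-32 of the packed address selects a slot in
        # one half of the array; the top index bit is the UDP/non-UDP bit.
        packed = ipaddress.ip_address(dst_addr).packed
        half = self.n // 2
        slot = zlib.crc32(packed) & (half - 1)
        return slot | (half if protocol == UDP else 0)

    def next_id(self, dst_addr, protocol):
        # Provide the current counter value, then increment with 16-bit wrap.
        i = self.index(dst_addr, protocol)
        ident = self.counters[i]
        self.counters[i] = (ident + 1) & 0xFFFF
        return ident
```

In this sketch, UDP datagrams and non-UDP datagrams for the same receiving station always draw
from different counters, so heavy traffic in one protocol does not advance the counter used by
the other.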

Figs. 3 to 6 are representational views that illustrate various techniques by which a
receiving station can further reduce a likelihood of misassembly of data fragments from two
different datagrams. In each of these figures, a data fragment is represented by a small box. A
letter in the box represents a datagram to which the data fragment belongs, and a number in the
box represents the data fragment's position in the datagram. A small numeral to the upper right
of each box indicates an order in which the data fragments have been received by the receiving
station in each illustrated example. A data fragment that has not been received does not have
such a numeral and is designated by a broken outline (see, e.g., data fragment A2 in Fig. 3).
Finally, a numeral under each data fragment indicates a size of the data fragment in bytes.

It should be noted that the details shown in Figs. 3 to 6, such as specific orders,
sizes and compositions of data fragments and datagrams, are provided solely to clarify aspects of
the invention discussed with respect to each figure and are for illustrative purposes only. The
invention is in no way limited to those particular details, as will be apparent to one skilled in the
art.

Fig. 3 is a representational view of a receiving station discarding a datagram upon
detection of an overlapping data fragment.

Briefly, a likelihood of misassembly of data fragments from fragmented IP
datagrams is reduced. Data fragments of a datagram having an IP identification number are
received. All received data fragments of the datagram are discarded upon detection of receipt of
an overlapping data fragment having the IP identification number, wherein the overlapping data
fragment overlaps data in an already-received data fragment. Preferably, this technique is
performed by an IP layer of a receiving station's communication system.

In more detail, receiving station 3 in Fig. 3 has received data fragments A1, A3,
A4 and A5. Data fragment A2 has not been received. Subsequently, data fragment B1 has been
received. Datagrams A and B have identical IP identification numbers in Fig. 3, for example as a
result of an IP identification number generator wrapping around in a sending station that sent
datagrams A and B. Thus, data fragment B1 overlaps data fragment A1. In other words, if data
fragment B1 were assembled into a datagram with data fragment A1, some data from one of the
data fragments would overlap data from the other data fragment. This overlapping corresponds
to a situation where misassembly can occur, for example if a data fragment B2 were subsequently
received. In Fig. 3, data fragment B1 overlaps all of data fragment A1.

According to the invention, receiving station 3 determines that data fragment B1
has overlapped data fragment A1. Upon determining that such overlapping has occurred, a
receiving station according to the invention discards all received data fragments for the datagram
with the overlapped data fragment. In Fig. 3, the datagram with overlapped data fragment A1 is
datagram A, so data fragments A1, A3, A4 and A5 are discarded. The invention similarly would
discard all received data fragments for datagram A if another of its data fragments had been
overlapped instead of data fragment A1, for example data fragment A3, A4 or A5.

By virtue of the foregoing operation, a receiving station discards data fragments
before misassembly can occur in some situations.

Fig. 4 is a representational view of a receiving station discarding a datagram upon
detection of a partially overlapping data fragment. In Fig. 4, receiving station 3 has received data
fragments A1, A3, A4 and A5. Data fragment A2 has not been received. Subsequently, data
fragment B2 has been received. Data fragment B2 has a size of 750 bytes, versus the 500 byte
size of data fragments A1, A3, A4 and A5. Such a difference in data fragment size can occur, for
example, if data fragment B2 traveled across network 2 along a path that had an MTU of 750
bytes, while the rest of the data fragments traveled across network 2 along paths with MTUs of
500 bytes.



Datagrams A and B have identical IP identification numbers in Fig. 4. In this 
situation, the first 500 bytes of data fragment B2 do not overlap any received data fragments of 
datagram A. However, the last 250 bytes of data fragment B2 do overlap part of data fragment 
A3. A receiving station according to the invention preferably would detect this overlap and 
would therefore discard data fragments A1, A3, A4 and A5. Thus, the invention preferably
discards data fragments of a datagram when any data in any of those data fragments is overlapped 
by any data in a subsequently received data fragment with the same IP identification number. 

By virtue of the foregoing operation, a receiving station discards data fragments 
before misassembly can occur in more situations than if only overlap of entire data fragments 
was acted upon. 
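
By way of illustration only, the overlap test of Figs. 3 and 4 might be sketched as follows
for a single reassembly buffer, i.e., for fragments that share one IP identification number.
The class and method names are hypothetical. Offsets are expressed in bytes to match the
figures, and the sketch also drops the overlapping data fragment itself, which is an
assumption of the example.

```python
class ReassemblyBuffer:
    """Fragments received so far under one IP identification number."""

    def __init__(self):
        self.spans = []  # list of (offset, length) pairs, in bytes

    def add(self, offset, length):
        """Store a fragment; on any overlap, discard everything and return False."""
        end = offset + length
        for o, n in self.spans:
            if offset < o + n and o < end:  # the two byte ranges intersect
                self.spans.clear()          # drop all fragments of this datagram
                return False
        self.spans.append((offset, length))
        return True
```

With the sizes of Fig. 4, data fragments A1, A3, A4 and A5 occupy offsets 0, 1000, 1500 and
2000 with 500 bytes each. A 750 byte data fragment B2 at offset 500 reaches byte 1250 and
therefore intersects A3, so the buffer is emptied.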



Fig. 5 is a representational view of a reduced timeout for reassembling datagrams 
at a receiving station. 



Briefly, a likelihood of misassembly of data fragments from fragmented IP
datagrams is reduced by reducing a timeout for reassembling the datagrams to less than a
standard timeout. Preferably, this technique is performed by an IP layer of a receiving station's
communication system.

In more detail, receiving station 3 in Fig. 5 has received data fragments A1 and
A3 of datagram A and data fragment B2 of datagram B. Data fragment A2 has not been
received. Datagrams A and B have identical IP identification numbers. Accordingly, data
fragments A1, B2 and A3 could be misassembled into a corrupted datagram as long as the data
fragments were all received before a timeout for datagram reassembly using data fragments A1
and A3.

In Fig. 5, time line 23 illustrates a standard IP datagram assembly timeout of 64
seconds. Data fragment B2 is received within this time frame, so if this standard timeout were
used, misassembly could occur. However, a receiving station according to the invention
preferably uses a reduced timeout such as that illustrated by time line 24. This timeout ends
before receipt of data fragment B2, preventing any chance of misassembly in the example shown
in Fig. 5.

A timeout of 45 seconds has been found to produce good results in terms of
allowing enough time for proper datagram reassembly while preventing some datagram
misassembly. Alternatively, the timeout could be determined based on network data for expected
communication (e.g., round-trip) times between a particular sending station and a particular
receiving station. Such network data preferably could be provided by an NFS layer of each
station's communication system.

Fig. 6 is a representational view of a receiving station reducing a remaining time
for reassembling a datagram upon detection of a gap in received data fragments of the datagram.

Briefly, a likelihood of misassembly of data fragments from fragmented IP
datagrams is reduced. Data fragments of a datagram having an IP identification number are
received. A remaining time for reassembling the datagram is reduced upon detection of a gap in
the received data fragments. Preferably, this technique is performed by an IP layer of a receiving
station's communication system.

In more detail, receiving station 3 in Fig. 6 has received data fragments A1 and
A3 of datagram A. Data fragment A2 has not been received, creating a gap in the received data
fragments. The gap can be detected by examining the lengths and offsets included in the headers
of the received data fragments. This gap indicates that data fragment A2 might have been lost in
transit, opening up the opportunity for a data fragment from another datagram to be improperly
inserted into this gap during reassembly. Accordingly, a receiving station according to the
invention reduces an amount of time left for receipt of the missing data fragment and reassembly
of the datagram.

Reducing the remaining reassembly time to eight seconds in such a situation has
been found to produce good results. Eight seconds has been found generally to allow enough 
time for receipt of a data fragment that has been merely delayed, while generally not allowing 
enough time for transmission of another datagram with the same IP identification number as the 
datagram with the gap. 

Of course, if the remaining time before timeout is less than eight seconds, only the 
remaining time is allowed before timeout. In other words, the remaining time is not increased to 
eight seconds if it is already less than eight seconds. 

By virtue of the foregoing operation, fewer opportunities for datagram 
misassembly tend to occur. 
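
By way of illustration only, the reduced timeout of Fig. 5 and the gap handling of Fig. 6
might be sketched together as follows. The names are hypothetical, time is passed in
explicitly as seconds, and the sketch treats any hole between offset zero and the highest
received byte as a gap, which is an assumption of the example.

```python
REDUCED_TIMEOUT = 45.0  # seconds; below the 64-second standard timeout
GAP_TIMEOUT = 8.0       # seconds allowed once a gap has been detected


def has_gap(spans):
    """True if the received (offset, length) spans leave a hole before their end."""
    expected = 0
    for offset, length in sorted(spans):
        if offset > expected:
            return True
        expected = max(expected, offset + length)
    return False


class ReassemblyTimer:
    """Deadline for reassembling one datagram."""

    def __init__(self, now, timeout=REDUCED_TIMEOUT):
        self.deadline = now + timeout

    def remaining(self, now):
        return max(0.0, self.deadline - now)

    def on_gap(self, now):
        # min() never extends the deadline, so a remaining time that is
        # already under eight seconds is left unchanged.
        self.deadline = min(self.deadline, now + GAP_TIMEOUT)
```

For example, a timer started at time 0 has 35 seconds remaining at time 10. If a gap is
detected at that moment, the remaining time becomes eight seconds. If the gap is instead
detected at time 40, the five seconds that remain are kept as they are.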

Fig. 7 is a representational view of a receiving station reducing a remaining time
for reassembling a datagram upon detection of a data fragment from another datagram having
the same source address and protocol as the datagram but a different IP identification number.

Briefly, a likelihood of misassembly of data fragments from fragmented IP
datagrams is reduced. Data fragments of a first datagram are received, with the datagram having
a protocol identification number, a source address, and a first IP identification number. A
remaining time for reassembling the first datagram is reduced upon detection, before receipt of a
last data fragment of the first datagram, of a data fragment of a second datagram having the
protocol identification number and the source address but having a second IP identification
number. Preferably, this technique is performed by an IP layer of a receiving station's
communication system.

In more detail, one problem with attempting to detect a gap in data fragments is
that IP does not provide enough information to directly detect loss of a last data fragment or
fragments of a datagram. In particular, IP data fragments indicate if they are or are not a last data
fragment. The only indications of relative positions of intermediate IP data fragments are offsets
from a start of the datagram. These offsets provide no information about how many data
fragments follow a given data fragment. Thus, if a last data fragment of a first datagram is lost, a
receiving station only knows that it has received some data fragments of the first datagram but
has not yet received a last data fragment. In this situation, the receiving station might receive
before timeout a data fragment from another datagram that happens to match the first datagram's
IP identification number. This occurrence can lead to misassembly of the first datagram.

Typically, a sending station will send all of a datagram in a particular protocol to a
particular receiving station before sending another datagram in that protocol to that receiving
station. Thus, possible loss of a last data fragment of a first datagram can be indicated by receipt
of a data fragment from a second datagram sent by the same sending station as the first datagram
using the same protocol. A receiving station can tell that the data fragment is from the second
datagram by checking for a different IP identification number than that used by the first
datagram. Accordingly, in order to help prevent misassembly, a receiving station can reduce a
time remaining for reassembling a datagram upon receipt of a data fragment from another
datagram having the same source address and protocol as the datagram but a different IP
identification number.
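
By way of illustration only, this rule might be sketched as follows. The table layout and
all names are hypothetical. The eight-second value is the reduced reassembly time given in
this description, and the 45-second default is the reduced timeout discussed with respect to
Fig. 5.

```python
SHORT_TIMEOUT = 8.0  # seconds


class ReassemblyTable:
    """In-progress datagrams keyed by (source address, protocol, IP ID)."""

    def __init__(self, timeout=45.0):
        self.timeout = timeout
        self.entries = {}

    def on_fragment(self, now, src, protocol, ip_id, more_fragments):
        # A fragment with a different ID from the same source and protocol
        # suggests the sender has moved on, so shorten any datagram from
        # that source and protocol whose last fragment is still missing.
        for (s, p, i), entry in self.entries.items():
            if (s, p) == (src, protocol) and i != ip_id and not entry["saw_last"]:
                entry["deadline"] = min(entry["deadline"], now + SHORT_TIMEOUT)
        entry = self.entries.setdefault(
            (src, protocol, ip_id),
            {"deadline": now + self.timeout, "saw_last": False},
        )
        if not more_fragments:
            entry["saw_last"] = True  # this was the datagram's last fragment
```

As with the gap case, the deadline is only ever shortened, and datagrams from other sources or
in other protocols are unaffected.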

Accordingly, receiving station 3 in Fig. 7 has received data fragments A1, A2, A3
and A4 of datagram A, but not last data fragment A5. Receiving station 3 has no way of
knowing if data fragment A5 is a last data fragment of datagram A. Receiving station 3 has
subsequently received data fragment C1 of datagram C. Datagrams A and C have different IP
identification numbers. Therefore, no overlapping can occur. Also, data fragments from
datagram C will not be misassembled with data fragments from datagram A (barring other
processing errors).

However, because datagram C shares the same source address and protocol as 
datagram A, it is likely that the last data fragment or fragments of datagram A have already been 
sent to receiving station 3 and may be lost. Therefore, receiving station 3 according to the 
invention preferably reduces a remaining time for receipt of the last data fragment or fragments 
and reassembly of the datagram. Reducing the remaining time to eight seconds has been found 
to produce good results, allowing for receipt of merely delayed data fragments while still tending 
to prevent misassembly.

The foregoing methods are designed to decrease a likelihood of datagram
misassembly. Even if misassembly occurs, UDP and other checksums probably will catch most
of the misassembled packets. However, a misassembled packet eventually might slip past the
checksums, especially if only UDP checksums are used, with possibly dire consequences for data
integrity. Thus, the invention also provides techniques for detecting when a likelihood of
datagram misassembly is high so that appropriate corrective action can be taken.

Fig. 8 is a flowchart for explaining determination that a likelihood of misassembly
of datagrams is high upon detection of a high rate of communication errors for a period of time.

Briefly, a likelihood of misassembly of data fragments from fragmented IP
datagrams is detected. In order to detect this likelihood, communication errors between a
sending station and a receiving station are detected. The likelihood of misassembly is
determined to be high upon detection that the communication errors occur at a high rate for a
predefined period of time.

In more detail, step S801 in Fig. 8 detects if communication errors are occurring
at a high rate for a period of time. This error detection can be performed at a sending station or at
a receiving station, both with respect to the station itself and with respect to other stations.

The types of errors indicative of datagram misassembly include IP layer
overlapping errors, IP layer timeout errors, UDP length errors, UDP checksum errors, and NFS
errors. Other errors also might be indicative of datagram misassembly.

IP layer overlapping errors can be flagged by a station's IP layer when overlapping
occurs as discussed above with respect to Figs. 3 and 4. Likewise, IP timeout errors can be the
result of reduced timeout and reassembly times as discussed above with respect to Figs. 5, 6 and
7. UDP length and checksum errors can be the direct result of datagram misassembly that is
properly caught by UDP error checking mechanisms. NFS errors can be the result of
misassembly errors that slipped through the IP and UDP error checking mechanisms. NFS errors
can be implied from an increased rate of NFS re-transmissions.

Datagram misassembly and situations that create an opportunity for datagram
misassembly have been found to create sustained higher rates of one or more of these types of
errors. Thus, if such errors are detected, flow proceeds from step S801 to S802, and it is
determined that a likelihood of datagram misassembly at the associated receiving stations is high.
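
By way of illustration only, the determination of steps S801 and S802 might be approximated
with a sliding-window error counter as follows. The description does not specify how a high
rate or the period of time is measured, so the window-based test, the class name, and both
parameters are assumptions of the example.

```python
from collections import deque


class ErrorRateMonitor:
    """Flags a high misassembly likelihood when errors stay frequent."""

    def __init__(self, max_errors, window):
        self.max_errors = max_errors  # errors tolerated per window (assumed)
        self.window = window          # observation period in seconds (assumed)
        self.times = deque()

    def record(self, now):
        """Record one error at time `now`; return True if the rate is high."""
        self.times.append(now)
        # Forget errors that fell out of the observation window.
        while self.times and self.times[0] <= now - self.window:
            self.times.popleft()
        return len(self.times) > self.max_errors
```

Each overlapping error, timeout error, UDP length or checksum error, or implied NFS error
would be passed to `record`. A return value of True corresponds to flow proceeding from step
S801 to step S802.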

With a high likelihood of datagram misassembly comes an increased chance that a
misassembled datagram will pass UDP's weak checksum. Accordingly, flow preferably
proceeds from step S802 to step S803, where policies are implemented to decrease the likelihood
of datagram misassembly and to increase a likelihood of catching misassembled datagrams.

Examples of the policies implemented in step S803 include using TCP instead of
UDP, if possible. TCP avoids reliance upon IP fragmentation. In addition, additional checksums
can be used. These additional checksums can include application checksums that are much
stronger than those typically used by communication protocols, possibly incorporating extremely
strong hashing functions such as MD5 and SHA. If UDP checksumming is turned off, it can be
turned on. NFS, application, and/or other checksums can be utilized, if possible. Furthermore, a
warning can be sent to the system administrators of both the sending and the receiving stations so
that the source of the errors can be tracked down and corrected.

Some of these policies may not be possible to implement in every situation. For
example, if a server is implementing the policies, the server may not be able to dictate use of
additional checksums to a client. Likewise, TCP may not be available between a particular
sending station and a particular receiving station. In these situations, policies preferably are not
implemented that prevent communications. Of course, if data integrity is essential,
communications with a station that is experiencing high error rates can be discontinued.

Fig. 9 is a flowchart for explaining determination that a likelihood of misassembly
of datagrams is high upon determination that an IP identification number generator wraps around
at faster than a predetermined rate.

Briefly, a sending station detects a likelihood of misassembly of data fragments
from fragmented IP datagrams sent to a receiving station. In order to detect this likelihood, the
sending station determines a rate at which an IP identification number generator associated with
the receiving station wraps around. The likelihood of misassembly at the receiving station is
determined to be high upon determination that the IP identification number generator wraps
around at faster than a predetermined rate.

In more detail, in step S901 of Fig. 9, a sending station determines a rate at which
its IP identification number generator(s) wrap around. In step S902, it is determined if this rate
exceeds a predetermined threshold for any particular IP identification number generator. A
predetermined threshold of one wraparound per 90 seconds has been found to work well.

If an IP identification number generator wraps around at faster than the
predetermined rate, a possibility exists that two datagrams having the same IP identification
number will be sent to a receiving station before the first of the two datagrams times out, thereby
creating an opportunity for datagram misassembly. Accordingly, if the threshold is exceeded,
flow proceeds to step S903. In step S903, it is determined that a likelihood of datagram
misassembly at the associated receiving stations is high.
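
By way of illustration only, steps S901 to S903 might be sketched as follows for one IP
identification number generator. The names are hypothetical, and the sketch assumes that the
90-second threshold is applied to the interval between successive wraparounds. For scale, a
16-bit counter wraps every 65536 datagrams, so a wraparound period of 90 seconds corresponds
to roughly 728 datagrams per second through that generator.

```python
WRAP_THRESHOLD = 90.0  # seconds between wraparounds


def wrap_period(datagrams_per_second):
    """Seconds a 16-bit counter takes to wrap at a steady sending rate."""
    return 65536 / datagrams_per_second


class WrapMonitor:
    """Tracks how quickly one IP identification number generator wraps."""

    def __init__(self):
        self.last_wrap = None
        self.high_likelihood = False

    def on_wrap(self, now):
        """Call when the counter rolls over from 0xFFFF to 0."""
        if self.last_wrap is not None:
            # Wrapping sooner than the threshold means the rate is too high.
            self.high_likelihood = (now - self.last_wrap) < WRAP_THRESHOLD
        self.last_wrap = now
        return self.high_likelihood
```

One monitor would be kept per generator, so that only the receiving stations associated with a
fast-wrapping generator are treated as being at high likelihood of misassembly.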

Flow then proceeds to step S904, where policies are implemented to address the
high likelihood of datagram misassembly. For example, if the sending station is utilizing plural
IP identification number generators, the association between the number generators and the
receiving stations can be changed so that fewer receiving stations are associated with the number
generator that is wrapping around too quickly. With reference to Fig. 2 above, one technique of
changing this association is to change hash 21.

In addition, policies along the lines of those discussed above with respect to step
S803 in Fig. 8 also can be implemented. Again, unless data integrity is essential, only those
policies that can be implemented without discontinuing communications preferably are
implemented.

Alternative Embodiments

Each of the techniques discussed above can be used in conjunction with the
others. For example, a sending station can check for communication errors in conjunction with a
high rate of IP identification number generator wraparound. Other combinations of the foregoing
techniques are possible. Thus, while preferred embodiments of the invention are disclosed
herein, many variations are possible which remain within the content, scope and spirit of the
invention, and these variations would become clear to those skilled in the art after perusal of this
application.